Procedural outline

Turn on machines / heaters. Put mice in tailcuff room and let the room and mice acclimate to appropriate temperature for ~30-60 mins. Then, check cuffs for leaks. Put mice into restraints and perform 5 acclimation cycles + 20 recorded cycles.

When placing mice into restraints,
1. Quickly place nose cuff on to avoid letting them turn around in the restraint
2. Make sure that you can see them breathing…
3. Maximize tightness of fit and breathing

Metadata Analysis

This workbook is set up to analyze two groups of mice! Just run and enjoy (you’ll probably need to change out drug names…)

Metadata
group Specimen Name Mouse Unique ID Gender Notch DOB Wean Date CageID Current Age (weeks) Date ready for Telemetry implant New CageID Status Date of death Machine ID Average body weight (g)
vehicle M1 1A Male NN 2021-03-25 2021-04-20 662055 9.285714 2021-05-20 NA Alive NA 1 26.4
sorafenib M2 2A Male RN 2021-03-25 2021-04-20 662055 9.285714 2021-05-20 NA Alive NA 1 24.9
sorafenib M3 3A Male LN 2021-03-25 2021-04-20 662055 9.285714 2021-05-20 NA Alive NA 1 26.2
vehicle M4 4A Male BN 2021-03-25 2021-04-20 662055 9.285714 2021-05-20 NA Alive NA 1 25.7
vehicle M5 5A Male DR 2021-03-25 2021-04-20 662055 9.285714 2021-05-20 NA Alive NA 1 27.5
sorafenib M6 1B Male NN 2021-03-30 2021-04-13 662047 8.571429 2021-05-25 NA Alive NA 2 25.1
vehicle M7 2B Male RN 2021-03-30 2021-04-13 662047 8.571429 2021-05-25 NA Alive NA 2 25.7
vehicle M8 3B Male LN 2021-03-30 2021-04-13 662047 8.571429 2021-05-25 NA Alive NA 2 24.9
sorafenib M9 4B Male BN 2021-03-30 2021-04-13 662047 8.571429 2021-05-25 NA Alive NA 2 23.4
sorafenib M10 5B Male DR 2021-03-30 2021-04-13 662047 8.571429 2021-05-25 NA Alive NA 2 25.3

Inspect accepted cycles and changes in mouse body weight over time

Average animal body weight

Joining, by = “Specimen Name”

Animal body weight change over time

Joining, by = “Specimen Name”

Blood Pressure Data Analysis

Filtering out days that had less than ‘x’ cycles

Removed days/Specimens with less than 5 cycles:
Specimen Name Date group # cycles reason
M3 2021-05-25 sorafenib 1 Low cycle count
M7 2021-05-25 vehicle 4 Low cycle count

Removing outliers

Detect outliers BP measurements using boxplot methods. Boxplots are a popular and an easy method for identifying outliers. There are two categories of outlier: (1) outliers and (2) extreme points. Values above Q3 + 2xIQR or below Q1 - 2xIQR are considered as outliers. Q1 and Q3 are the first and third quartile, respectively. IQR is the interquartile range (IQR = Q3 - Q1). This method is more robust than STDEV based outlier detection because outliers can skew the mean and STDEV of a sample.

Here, outliers are nominated based on daily blood pressure recordings, so as to not throw out data on treatment days when the blood pressure is expected to rise above the average.

Additionally, we remove mice that are too ‘volatile’ after trianing period has finished.

Plot the data over time and visualize the variance per day, per sample with boxplots, over all days

Pilot 2 | sorafenib 100 mg/kg/d

Dates
Phase first last date_range Number of Days
training 2021-05-25 2021-05-28 2021-05-25 to 2021-05-28 4
vehicle 2021-05-29 2021-05-29 2021-05-29 to 2021-05-29 1

Assess BP of randomly assigned groups before getting treatment

3-day rolling average

Take the last 3-days of an interval and average them for more realistic averages

summarise() has grouped output by ‘Specimen Name’, ‘group’. You can override using the .groups argument. summarise() has grouped output by ‘group’. You can override using the .groups argument.

Time-series data

For completeness, here is the time series data of each mouse across each Phase of the experiment:

Time-series diff by individual mouse from vehicle average to end of treatment

Time-series diff by group

These are the average blood pressures of each mice across each Phase of the experiment

Booklet usage instructions

Download excel data after finishing the experiments onto thumb drive

Copy the data into a master excel file with two sheets, making sure that there’s only one header (at the very top of the page).

  1. You’ll need to add in two columns manually to this master sheet: Date and Phase. Phase can take one of 4 values: “training”, “baseline”, “vehicle”, “treatment”. Training data ultimately gets removed, but included in the data sheet for completeness. First sheet should look like this:
  1. Second sheet should be created manually based on your mice. Fill in the various fields.
---
title: Piloting the administration of Sorafenib 100 mg/kg/d via oral gavage and measuring
  blood pressure by the tail-cuff method
author: "Nick Camarda"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  html_document:
    df_print: paged
---

## Procedural outline
Turn on machines / heaters. Put mice in tailcuff room and let the room and mice acclimate to appropriate temperature for ~30-60 mins. Then, check cuffs for leaks. Put mice into restraints and perform 5 acclimation cycles + 20 recorded cycles. 

When placing mice into restraints,  
  1. Quickly place nose cuff on to avoid letting them turn around in the restraint\
  2. Make sure that you can see them breathing...  
  3. Maximize tightness of fit and breathing

```{r read_data,  echo = FALSE, message=FALSE, warning=FALSE}


library(tidyverse)
library(rstatix)
library(ggthemes)
library(ggforce)
# library(ggstatsplot)
# library(gapminder)
library(ggsci)
library(ggpubr)
library(rstatix) # identify_outliers

library(readxl)
library(forcats)

library(knitr)
library(kableExtra)

library(gridExtra)
library(GetoptLong)

# library(experiment)
library(randomizr)

knitr::opts_chunk$set(echo = FALSE, message = FALSE, 
                      warning = FALSE, results = "asis", fig.width = 10, fig.height = 8,
                      fig.path = "README_figs/README-")

fct_phases <- c("training", "baseline", "vehicle", "treatment", "treatment 2x")
tx_phases <- c("treatment", "treatment 2x")



my_theme <- theme_stata() + 
  theme(plot.title = element_text(face = "bold", size = 20),
        plot.subtitle = element_text(face = "italic", color = "black"),
        axis.text = element_text(size = 12),
        axis.title = element_text(size = 14, face = "bold")) 



# attach thumbdrive first
# data_dir <- "/Volumes/NDC2021/tailcuff"
# project_dir <- "sor-pilot2"
# 
# raw_data_dir_full <- file.path(data_dir, project_dir)
# raw_data_tbl <- tibble(fn = list.files(raw_data_dir_full, full.names = TRUE)) %>%
#   mutate(Date = str_split(string = fn, pattern = "-a_|-b_|\\.", simplify = TRUE)[,2]) %>% 
#   unnest() %>%
#   mutate(Date = as.Date(Date, "%m_%d_%Y")) %>%
#   mutate(dat = map(fn, read_csv)) %>%
#   unnest() %>%
#   select(-fn) %>%
#   relocate(Date, .after = last_col()) %>%
#   mutate(Phase = "training") %>%
#   arrange(Date)
# 
# 
# write_csv(raw_data_tbl, "~/Downloads/temp.csv")

#' reads in mega excel file 
data_dir <- "/Users/ncamarda/OneDrive - Tufts/phd/projects/vegfri/sorafenib/mouse-studies/tail-cuff"
fn_name <- "Pilot 2 - sorafenib oral gavage - dosage 100 mgkgd - MASTER.xlsx"
full_path_fn <- file.path(data_dir, fn_name)

split_fn_name <- str_split(string = fn_name, pattern = " - ", simplify = TRUE)
trial_name <- split_fn_name[,1]
drug_name <- str_split(string = split_fn_name[,2], pattern = " ", simplify = TRUE)[,1]
drug_dosage <- str_split(string = split_fn_name[,3], pattern = " ", simplify = TRUE)[,2]


# Extract the trial num from fn_name, number next to "pilot"
trial_num <- str_split(string = fn_name, pattern = " ", simplify = T)[,2]

#' other removed fn denotes other reasons why you might not want to include a mouse's data for a specific day
other_removed_fn <- qq("./other_removed-@{drug_name}-@{trial_num}.csv")
if (!file.exists(other_removed_fn)) {
  file.create(other_removed_fn, showWarnings = FALSE)
}

shape_id_df <- tibble(`Specimen Name` = str_c("M",seq(1, 10, 1)), 
                      shape_id = c(0:6,10:12)) # c(0:6,10) 

## Read in the data
data_temp <- read_excel(full_path_fn, sheet = 1) %>% 
  dplyr::select(`Specimen Name`, Systolic, Mean, Rate, `Cycle #`, `Date`, Phase) %>%
  mutate(Date = as.Date(Date, "GMT")) %>%
  arrange(Date, `Specimen Name`, `Cycle #`) %>%
  mutate(`Specimen Name` = factor(`Specimen Name`, levels = str_c("M", seq(1, 10, 1)))) %>%
  # filter(Phase != "training", Phase != "tx1") %>%
  mutate(Phase = factor(Phase, levels = fct_phases)) %>%
  arrange(Phase, `Specimen Name`) 

meta_df_all_temp <- read_excel(full_path_fn,
                               sheet = 2) %>%
  mutate(Date = as.Date(Date, "GMT"),
         DOB = as.Date(DOB, "GMT")) %>%
  arrange(Date, `Specimen Name`) %>%
  mutate(`Specimen Name` = factor(`Specimen Name`, levels = str_c("M", seq(1, 10, 1))),
         `Date` = as.factor(Date),
         `Body weight (g)` = as.numeric(`Body weight (g)`)) %>%
  group_by(`Specimen Name`)


dead_mice <- meta_df_all_temp %>% 
  distinct(`Specimen Name`, `Date of death`) %>%
  na.omit() 


# meta_df_all_temp_copy <- apply(dead_mice, 1, function(r) {
#   # r <- dead_mice[2,]
#   sn <- as.character(getElement(r, "Specimen Name"))
#   dod <- as.Date(getElement(r, "Date of death"))
# 
#   loop <- apply(X = meta_df_all_temp, MARGIN = 1, FUN = function(rr) {
#     # rr <- meta_df_all_temp[1,]
#     fsn <- as.character(getElement(rr, "Specimen Name"))
#     fd <- as.Date(getElement(rr, "Date"))
# 
#     if (fsn == sn & dod < fd) remove <- TRUE
#     else remove <- FALSE
#     return(remove)
#   })
#   return(meta_df_all_temp[!loop,])
# })

meta_df_temp <- meta_df_all_temp %>% 
  summarize(`Average body weight (g)` = mean(`Body weight (g)`, na.rm = T), .groups = "keep") 


meta_df_all <- meta_df_all_temp
meta_df <- inner_join(meta_df_all %>% 
                        dplyr::select(-`Body weight (g)`, -`Date`) %>% 
                        distinct(), 
                      meta_df_temp) %>%
  arrange(`Specimen Name`) %>%
  anti_join(dead_mice %>% distinct(`Specimen Name`))

# block random assignment by machine

set.seed(2222)
num_mice <- nrow(meta_df);
# https://cran.r-project.org/web/packages/randomizr/vignettes/randomizr_vignette.html


meta_df_w_assign <- meta_df %>% 
  bind_cols(tibble(group = complete_ra(N = num_mice, prob = 0.5))) %>%
  mutate(group = ifelse(group == 0, "vehicle", drug_name)) %>%
  mutate(group = factor(group, levels = c(drug_name, "vehicle"))) %>%
  arrange(`Specimen Name`, group) %>%
  dplyr::select(group, everything())

num_mice_per_group <- meta_df_w_assign %>% 
  group_by(group) %>%
  summarize(n = n())

num_mice_per_group_str <- num_mice_per_group %>% 
  group_by(group) %>%
  mutate(n_str = map2_chr(group, n, function(a,b) qq("N @{group} = @{n}"))) %>%
  pluck("n_str") %>% 
  str_c(collapse = " | ")


# merge data and meta
data <- data_temp %>% 
  right_join(meta_df_w_assign) %>% # inner join? right join?
  right_join(shape_id_df) %>%
  mutate(color = ifelse(group == "vehicle", "#1e81b0", "#ed9942")) %>%
  mutate(color = factor(color, levels = c("#ed9942", "#1e81b0"))) %>%
  mutate(`Specimen Name` = factor(`Specimen Name`, levels = unique(.$`Specimen Name`))) 
# %>%
#   filter(`Specimen Name` != "M2")

# rand_dat <- randomize(data, match = "Systolic")

shape_ids <- unique(data$shape_id)

sun_color <- which(meta_df_w_assign$group == drug_name)
veh_color <- which(meta_df_w_assign$group == "vehicle")
```

## Metadata Analysis
This workbook is set up to analyze two groups of mice! Just run and enjoy (you'll probably need to change out drug names...)

```{r plot_meta, echo = FALSE, message=FALSE, warning=FALSE}
kable(meta_df_w_assign, caption = "Metadata", align = "c") %>%
  kable_styling("striped") %>%
  row_spec(sun_color, bold = T, color = "white", background = "#ed9942") %>%
  row_spec(veh_color, bold = T, color = "white", background = "#1e81b0")


# table(tx = meta_df_w_assign$group, machine = meta_df_w_assign$`Machine ID`

```


## Inspect accepted cycles and changes in mouse body weight over time
```{r accepted_cycles, echo = FALSE, message=FALSE, warning=FALSE, fig.height=8}
remove_bad_days_df1 <- data %>%
  group_by(`Specimen Name`, Date, group, color) %>%
  summarize(n = n()) %>%
  arrange(Date, `Specimen Name`) 

if (nrow(remove_bad_days_df1) != 0){
  g1 <- ggplot(remove_bad_days_df1, aes(color = `group`, x = Date, y = n)) +
  geom_point(show.legend = F) + 
  geom_line() + 
  geom_smooth(method = "lm", color = "cyan", linetype = "dashed") +
  # geom_bar(position = "dodge", stat = "identity") +
  scale_color_manual(name = "group", values = levels(remove_bad_days_df1$color)) +
  scale_x_date(breaks = unique(remove_bad_days_df1$Date)) + 
  my_theme +
  scale_y_continuous(breaks=seq(0,35,5)) +
  theme(axis.text.x = element_text(angle = 60, hjust=1, size = 5)) +
  ylab("Number of Accepted Cycles") +
  ggtitle("Accepted cycles per mouse over time") + 
  facet_grid(cols = vars(`Specimen Name`))
g1
}


```

### Average animal body weight
```{r avg_bodyweight, fig.height=8, fig.width=8}

meta_df_weight <- left_join(meta_df_all, 
                            meta_df_w_assign %>% 
                              ungroup() %>% 
                              distinct(`Specimen Name`, group)) %>%
  filter(Status == "Alive")
g2 <- ggplot(meta_df_weight, aes(fill = `group`, x = `Specimen Name`, y = `Body weight (g)`)) +
  geom_point() + 
  geom_boxplot(position = "dodge", alpha=0.8) +
  stat_summary(fun="mean", color = "cyan", show.legend = F) + # position = "dodge", stat = "identity"
  scale_fill_manual(name = "group", values = levels(remove_bad_days_df1$color)) +
  my_theme +
  scale_y_continuous(breaks=seq(20,30,2)) +
  ylab("Body weight (g)") +
  ggtitle("Average animal body weight (g)") + 
  geom_text(data = meta_df_w_assign, 
    aes(x = `Specimen Name`, y = `Average body weight (g)`, 
        label = round(`Average body weight (g)`,1), vjust = 2.5), 
    hjust = 0.5, color = "black", size = 5, 
    inherit.aes = FALSE) 
g2

dates <- unique(meta_df_all$Date)
reps <- length(unique(meta_df_all$`Specimen Name`))
date_tbl <- tibble(Date = rep(dates, reps))
```

### Animal body weight change over time
```{r weight_change, fig.height=8, fig.width = 12}       
weight_change_df <- meta_df_all %>%
  arrange(Date) %>%
  group_by(`Specimen Name`) %>% 
  summarize(`Weight change (g)` = `Body weight (g)` - `Body weight (g)`[1L]) %>%
  bind_cols(date_tbl) %>%
  mutate(Date = as.Date(Date)) %>%
  left_join(data %>% distinct(`Specimen Name`, group))

min_weight_chg <- weight_change_df %>% na.omit() %>% .$`Weight change (g)` %>% min()
max_weight_chg <- weight_change_df %>% na.omit() %>% .$`Weight change (g)` %>% max()
ggplot(weight_change_df %>% na.omit(), aes(x = `Date`, y = `Weight change (g)`, color = `group`)) +
  geom_point(show.legend = F) + 
  geom_line() + 
  geom_smooth(method = "lm", color = "cyan", linetype = "dashed") +
  scale_color_manual(name = "group", values = levels(remove_bad_days_df1$color)) + 
  my_theme +
  scale_fill_manual(name = "Weight change (g)", ) + 
  scale_y_continuous(breaks = seq(min_weight_chg, max_weight_chg)) +
  scale_x_date(breaks = unique(weight_change_df$Date)) +
  theme(axis.text.x = element_text(angle = 75, size = 6, vjust = 0.5),
        axis.text.y = element_text(hjust = 0.55),
        panel.spacing = unit(2, "lines")) +
    labs(title = "Weight change (g) since beginning of experiment\n") +
  facet_grid(cols = vars(`Specimen Name`))
```

```{r change_weight_avg}
# daily average weight loss
daily_lost_weight_avg_df <- weight_change_df %>% 
  na.omit() %>% 
  group_by(Date) %>%
  summarize(weight_loss_avg = mean(`Weight change (g)`))

final_avg_weight <- daily_lost_weight_avg_df %>% slice(n()) 
  
ggplot(daily_lost_weight_avg_df, aes(x = `Date`, y = `weight_loss_avg`)) +
  geom_point(show.legend = F) + 
  geom_line() + 
  geom_smooth(method = "lm") +
  my_theme +
  geom_label(data = final_avg_weight, 
             label = round(final_avg_weight$weight_loss_avg, 2), 
             vjust = 1.5, color = "red") +
  ylab("Average daily weight change (g)") + 
  # scale_y_continuous(breaks = seq(-10,10,.2)) +
  scale_x_date(breaks = unique(weight_change_df$Date)) +
  theme(axis.text.x = element_text(angle = 75, size = 8, vjust = 0.5),
        axis.text.y = element_text(hjust = 0.55),
        panel.spacing = unit(2, "lines")) +
    labs(title = "Average weight change (g) across all mice\nsince beginning of experiment\n")  
```

## Blood Pressure Data Analysis
### Filtering out days that had less than 'x' cycles
```{r bad_days1, echo = FALSE, message=FALSE, warning=FALSE}
FILTER_OUT_LESS_THAN_CYCLES <- 5
removed_df_temp <- inner_join(remove_bad_days_df1 %>% filter(n < FILTER_OUT_LESS_THAN_CYCLES), 
                         remove_bad_days_df1) %>%
  dplyr::select(-color) %>%
  mutate(reason = "Low cycle count")

other_df <- read_csv(other_removed_fn)
removed_df <- bind_rows(removed_df_temp, other_df) %>% arrange(Date)

# print("Removed days/Specimens:")
kable(removed_df %>% rename(`# cycles` = n),
      caption = qq("Removed days/Specimens with less than @{FILTER_OUT_LESS_THAN_CYCLES} cycles:")) %>%
  kable_styling("striped")
 
# filtered df
bad_days_removed_df <- anti_join(data, removed_df) 

```

### Removing outliers
Detect outliers BP measurements using boxplot methods. Boxplots are a popular and an easy method for identifying outliers. There are two categories of outlier: (1) outliers and (2) extreme points. Values above Q3 + 2xIQR or below Q1 - 2xIQR are considered as outliers. Q1 and Q3 are the first and third quartile, respectively. IQR is the interquartile range (IQR = Q3 - Q1). This method is more robust than STDEV based outlier detection because outliers can skew the mean and STDEV of a sample. 

Here, outliers are nominated based on *daily* blood pressure recordings, so as to not throw out data on treatment days when the blood pressure is expected to rise above the average. 

Additionally, we remove mice that are too 'volatile' after trianing period has finished.

```{r outliers, echo=FALSE, fig.height=10, message=FALSE, warning=FALSE}
fences_df <- bad_days_removed_df %>%
  ungroup() %>%
  dplyr::select(`Specimen Name`, Date, `Cycle #`, Systolic) %>%
  dplyr::group_by(`Specimen Name`, `Date`) %>%
  dplyr::summarize(median = median(Systolic),
                   iqr = IQR(Systolic),
                   min = min(Systolic),
                   max = max(Systolic),
                   first_quantile = quantile(Systolic, 0.25),
                   third_quantile = quantile(Systolic, 0.75)) %>%
  mutate(lower_fence = first_quantile - iqr*2,
         upper_fence = third_quantile + iqr*2) %>%
  distinct(`Specimen Name`, Date, lower_fence, upper_fence)


mark_outliers_df <- bad_days_removed_df %>%
  group_by(`Specimen Name`, `Date`) %>%
  left_join(fences_df, by = c("Specimen Name", "Date")) %>%
  mutate(is.outlier = Systolic > upper_fence | Systolic < lower_fence) %>%
  dplyr::select(is.outlier, everything())

count_outliers <- mark_outliers_df %>% 
  group_by(`Specimen Name`) %>%
  summarize(`# Outliers` = sum(is.outlier)) %>%
  filter(`# Outliers` != 0) %>%
  arrange(`Specimen Name`)

# kable(count_outliers, caption = "Specimens by # of outliers", align = "lc") %>%
#   kable_styling("striped")  


# remove all rows that are outliers
# set factor 
final_filtered_data <- mark_outliers_df %>%
  filter(!is.outlier) %>%
  group_by(`Specimen Name`, `Date`, Phase, color, group) %>%
  mutate(Phase = factor(Phase, levels = fct_phases)) %>%
  arrange(Phase)


#' @note analysis of outlier removal
with_outliers_g <- ggplot(mark_outliers_df, aes(x = `Specimen Name`, y = Systolic, color = group)) + 
  geom_boxplot() + 
  scale_color_manual(name = "group", values = levels(mark_outliers_df$color)) +
  geom_point(data = mark_outliers_df %>% filter(is.outlier), aes(x = `Specimen Name`, y = Systolic), color = "magenta", shape = 17, size = 3) + 
  my_theme + 
  ggtitle("Outliers marked") + 
  facet_grid(~ Date)

# just plot the watermark of where the outliers used to be...
without_outliers_g <- ggplot(final_filtered_data, aes(x = `Specimen Name`, y = Systolic, color = group)) + 
  geom_boxplot() + 
  scale_color_manual(name = "group", values = levels(mark_outliers_df$color)) +
  geom_point(data = mark_outliers_df %>% filter(is.outlier), 
             aes(x = `Specimen Name`, y = Systolic), color = "magenta", shape = 17, size = 3, alpha = 0.3) + 
  scale_y_continuous(breaks = seq(50,220,20)) + 
  my_theme + 
  theme(axis.text.x = element_text(angle = 60, hjust=1, size = 7),
        axis.text.y = element_text(hjust = 0.5)) +
  ggtitle("Outliers removed") + 
  facet_wrap(~ Date, ncol = 5)
without_outliers_g


outlier_mice_df <- final_filtered_data %>% 
  filter(Phase != "training") %>%
  dplyr::select(`Specimen Name`, Date, `Cycle #`, Systolic) %>%
  dplyr::group_by(`Specimen Name`, Phase) %>% # across all dates
  dplyr::summarize(sem = sd(Systolic)/sqrt(num_mice))


```




## Plot the data over time and visualize the variance per day, per sample with boxplots, over all days
<!-- ### Trial 3 -->
  <!-- #### Starting 04/30/21 - oral gavage with 200 uL and tailcuff training begins -->
  <!-- #### 05/6/21 - vehicle treatment (10% DMSO / 40% PEG/ 5% Tween / 45% Saline) begins to establish baseline BP -->
  <!-- #### 5/10/21 - randomized to SORAFENIB treatment or vehicle  -->
  
  <!-- #### Starting 04/15/21 - oral gavage with 200 uL and tailcuff training begins -->
  <!-- #### 04/18/21 - baseline recording before vehicle treatment -->
  <!-- #### 04/21/21 - vehicle treatment (10% DMSO / 40% PEG/ 5% Tween / 45% Saline) begins to establish baseline BP -->
  <!-- #### 04/24/21 - randomized to sunitinib (80 mg/kg/d) treatment or vehicle  -->
  
  <!-- Trial 2 -->
  <!-- ### Starting 03/05/21 - oral gavage and tailcuff training begins -->
  <!-- ### Starting 03/08/21 - baseline recording before vehicle treatment -->
  <!-- ### Starting 03/11/21 - vehicle treatment (10% PEG/ 0.5% Tween /~90% DI H2O at pH ~3.5) begins to establish baseline BP -->
  <!-- ### Starting 03/15/21 - Randomized to Sunitinib (40 mg/kg/d) treatment or vehicle begins -->
  
```{r calc_important_dfs, echo = FALSE, message=FALSE, warning=FALSE}

#' @Note main df! this is specimen average
specimen_avg_data_df <- final_filtered_data %>% 
  summarize(`specimen_mean_systolic` = mean(Systolic), 
            `specimen_median_systolic` = median(Systolic),
            specimen_sd_systolic = sd(Systolic),
            specimen_sem = specimen_sd_systolic / sqrt(num_mice), 
            .groups = "keep") %>%
  # necessary?
  ungroup() %>%
  mutate(Phase = factor(Phase, levels = fct_phases))

cat(qq("\n## @{trial_name} | @{drug_name} @{drug_dosage} mg/kg/d\n"))
cutoff_dates_df <- specimen_avg_data_df %>%
  group_by(Phase) %>%
  summarize(first = first(Date), last = last(Date)) %>%
  mutate_all(.funs = as.character) %>%
  mutate(date_range = map2_chr(first, last, function(f,l){
    res <- str_c(c(f, l), collapse = " to ")
    return(res)
  })) %>%
  mutate(`Number of Days` = as.integer(as.Date(last) - as.Date(first)) + 1) 

kable(cutoff_dates_df, caption = "Dates", align = "c") %>%
  kable_styling("striped")



group_wise_bp_and_sem <- specimen_avg_data_df %>%
  group_by(`Specimen Name`, group, Phase) %>% 
  summarize(specimen_grp_phase_mean_systolic = mean(specimen_mean_systolic),
            specimen_grp_phase_sd_systolic = sd(specimen_mean_systolic),
            specimen_grp_phase_sem = specimen_grp_phase_sd_systolic / sqrt(num_mice)) 

phase_wise_bp_and_sem <- specimen_avg_data_df %>%
  group_by(group, Phase) %>% 
  summarize(grp_phase_mean_systolic = mean(specimen_mean_systolic),
            grp_phase_sd_systolic = sd(specimen_mean_systolic),
            grp_phase_sem = grp_phase_sd_systolic / sqrt(num_mice)) 

```

## Assess BP of randomly assigned groups before getting treatment

```{r plot_groupwise}
ggplot(data = phase_wise_bp_and_sem, 
       aes(fill = `group`, x = group, y = grp_phase_mean_systolic)) +
  
  geom_bar(position = "dodge", stat = "identity", width = 0.5) +
  # stat_summary(fun ="mean", color = "cyan", show.legend = F) +
  
  geom_point(group_wise_bp_and_sem, 
             mapping = aes(x = group, y = specimen_grp_phase_mean_systolic, shape = `Specimen Name`), 
             show.legend = TRUE, inherit.aes = FALSE) + 
  scale_shape_manual(values = shape_ids) +
  
  # text by group and phase
  geom_label(data = phase_wise_bp_and_sem,
             aes(x = `group`, y = max(grp_phase_mean_systolic) + 35,
                 label = round(grp_phase_mean_systolic,1)),
             hjust = 0.5, color = "black", size = 3,
             inherit.aes = FALSE, show.legend = F) +
  
  geom_errorbar(data = phase_wise_bp_and_sem,
                aes(x = group, ymin=`grp_phase_mean_systolic`-grp_phase_sem, 
                    ymax=`grp_phase_mean_systolic`+grp_phase_sem), width=.2,
                position=position_dodge(0.05), show.legend = F, inherit.aes = FALSE) +
  # change fill colors
  scale_fill_manual(name = "group", values = levels(specimen_avg_data_df$color)) +
  my_theme +
  # scale_y_continuous(breaks=seq(90, 220, 10)) +
  theme(axis.text.x = element_text(angle = 60, hjust=1),
        axis.text.y = element_text(hjust = 0.5)) +
  ylab("Average Systolic BP") +
  ggtitle("Average Systolic BP across phases") +
  facet_grid(cols = vars(Phase), scales = "free_x") 

```


### 3-day rolling average
##### Take the last 3-days of an interval and average them for more realistic averages
```{r three_day}

three_day_df <- specimen_avg_data_df %>%
  group_by(`Specimen Name`, Phase) %>%
  mutate(n = n():1) %>%
  filter(n %in% c(1:3))

three_day_group_df <- three_day_df %>%
  group_by(`Specimen Name`, group, Phase) %>% 
  summarize(specimen_grp_phase_mean_systolic = mean(specimen_mean_systolic),
            specimen_grp_phase_sd_systolic = sd(specimen_mean_systolic),
            specimen_grp_phase_sem = specimen_grp_phase_sd_systolic / sqrt(num_mice)) 

three_day_phase_df <- three_day_df %>%
  group_by(group, Phase) %>% 
  summarize(grp_phase_mean_systolic = mean(specimen_mean_systolic),
            grp_phase_sd_systolic = sd(specimen_mean_systolic),
            grp_phase_sem = grp_phase_sd_systolic / sqrt(num_mice)) 

ggplot(data = three_day_phase_df, 
       aes(fill = `group`, x = group, y = grp_phase_mean_systolic)) +
  
  geom_bar(position = "dodge", stat = "identity", width = 0.5) +
  # stat_summary(fun ="mean", color = "cyan", show.legend = F) +
  
  geom_point(three_day_group_df, 
             mapping = aes(x = group, y = specimen_grp_phase_mean_systolic, shape = `Specimen Name`), 
             show.legend = TRUE, inherit.aes = FALSE) + 
  scale_shape_manual(values = shape_ids) +
  
  # text by group and phase
  geom_label(data = three_day_phase_df,
             aes(x = `group`, y = max(grp_phase_mean_systolic) + 35,
                 label = round(grp_phase_mean_systolic,1)),
             hjust = 0.5, color = "black", size = 3,
             inherit.aes = FALSE, show.legend = F) +
  
  geom_errorbar(data = three_day_phase_df,
                aes(x = group, ymin=`grp_phase_mean_systolic`-grp_phase_sem, 
                    ymax=`grp_phase_mean_systolic`+grp_phase_sem), width=.2,
                position=position_dodge(0.05), show.legend = F, inherit.aes = FALSE) +
  # change fill colors
  scale_fill_manual(name = "group", values = levels(three_day_df$color)) +
  my_theme +
  
  scale_y_continuous(breaks=seq(40, 175, 10)) +
  theme(axis.text.x = element_text(angle = 60, hjust=1),
        axis.text.y = element_text(hjust = 0.5)) +
  ylab("Average Systolic BP") +
  ggtitle("Last-most 3-day averages SBP across Phases") +
  facet_grid(cols = vars(Phase), scales = "free_x") 

```


### Time-series data
For completeness, here is the time series data of each mouse across each Phase of the experiment:
```{r time, fig.width=15, fig.height=10, fig.width=18}
plot_time_series_dates_add <- specimen_avg_data_df %>%
  group_by(Phase) %>%
  arrange(Phase, Date) %>% # critical to sort because we use first and last next
  summarize(start = as.Date(first(Date)), end = as.Date(last(Date)) + 1) %>%  # just add 1 day!
  mutate(date_range = map2_chr(start, end, function(f,l){
    res <- str_c(c(as.character(f), as.character(l)), collapse = " to ")
    return(res)
  })) %>%
  mutate(mid_s = (start - end)/2,
         mid = end + mid_s)

plot_time_series_df <- specimen_avg_data_df %>%
  left_join(plot_time_series_dates_add) %>%
  left_join(meta_df_w_assign %>% dplyr::select(`Specimen Name`, `Machine ID`)) %>%
  mutate(`Machine ID` = factor(`Machine ID`))


ggplot(plot_time_series_df,
       aes(x = `Date`,
           y = specimen_mean_systolic, color = `group`, shape = `Specimen Name`)) +
  scale_shape_manual(values = shape_ids) +
  geom_point(size = 8) +
  geom_line() + 
  # geom_line(show.legend = FALSE) +
  #   geom_smooth(aes(linetype = `Specimen Name`), method = "lm", formula = y ~ splines::bs(x, 5),
  #             size = 3, se = FALSE) +
  geom_smooth(aes(x = `Date`, y = specimen_mean_systolic, group = group, color = group),
              method = "lm", formula = y ~ splines::bs(x, 4),
              size = 3, se = FALSE, inherit.aes = FALSE) +
  
  geom_rect(data = plot_time_series_dates_add, inherit.aes = F,
            aes(xmin = as.Date(start), xmax = as.Date(end), ymin = -Inf, ymax = Inf, fill = Phase),
            alpha = 0.3, show.legend = F) +
  
  geom_label(data = plot_time_series_dates_add,
             aes(x = mid, y = min(plot_time_series_df$specimen_mean_systolic),
                 label = Phase), inherit.aes = FALSE, show.legend = FALSE, size = 6) +
  
  geom_vline(aes(xintercept = start),
             data = plot_time_series_dates_add,
             colour = "grey50", alpha = 0.5, show.legend = FALSE) +
  
  geom_errorbar(aes(ymin = specimen_mean_systolic - specimen_sem,
                    ymax = specimen_mean_systolic + specimen_sem), width=.2,
                position=position_dodge(0.05), show.legend = FALSE) +
  my_theme + 
  scale_fill_manual(values = c("gray", "#1e81b0", "#ed9942", "darkorange1", "red")) + 
  scale_color_manual(name = "group", values = levels(specimen_avg_data_df$color)) +
  
  scale_y_continuous(breaks=seq(40,250,10)) +
  scale_x_date(breaks=unique(specimen_avg_data_df$Date)) +
  theme(axis.text.x = element_text(angle = 60, hjust=1),
        axis.text.y = element_text(hjust = 0.5)) +
  labs(title = "Time-series plot of Average Tail-cuff Systolic BP measurements") +
  ylab("Systolic blood pressure (mmHg)")


# cumsum_plot_time_series_df <- plot_time_series_df %>%
#   group_by(`Specimen Name`, Phase, color) %>% 
#   summarize(tsum = cumsum(specimen_mean_systolic)) %>%
#   bind_cols(plot_time_series_df %>% 
#               select(-c(`Specimen Name`, Phase, color)))
# 
# 
# ggplot(cumsum_plot_time_series_df, aes(x = Date, y = tsum, color = group)) + geom_line()
```

```{r avg_time_series}
time_series_avg_by_group_phase <- specimen_avg_data_df %>%
  group_by(group, Phase, Date) %>%
  summarize(group_phase_avg_systolic = mean(specimen_mean_systolic),
            group_phase_sem = sd(specimen_mean_systolic)/sqrt(num_mice))

ggplot(time_series_avg_by_group_phase,
       aes(x = `Date`,
           y = group_phase_avg_systolic, color = `group`)) +
  geom_point(show.legend = F) +
  geom_line()+
  geom_line(aes(group = group, color = group), stat="smooth", method = "lm",
            size = 1.5,
            linetype ="dashed",
            alpha = 0.7) +
  # scale_color_jco() +
  my_theme +
  geom_errorbar(aes(ymin = group_phase_avg_systolic - group_phase_sem,
                    ymax = group_phase_avg_systolic + group_phase_sem), width=.2,
                position=position_dodge(0.05))+
  # scale_y_continuous(breaks=seq(70,220,10)) +
  scale_x_date(breaks=unique(specimen_avg_data_df$Date)) +
  theme(axis.text.x = element_text(angle = 60, hjust=1),
        axis.text.y = element_text(hjust = 0.5)) +
  labs(title = "Time-series plot of Average Tail-cuff Systolic BP measurements") +
  ylab("Systolic blood pressure (mmHg)") +
  scale_color_manual(name = "group", values = levels(specimen_avg_data_df$color)) +
  facet_grid(cols = vars(Phase),  scales = "free_x")
```

# Time-series diff by individual mouse from vehicle average to end of treatment
```{r time_series_diff}
time_series_avg_diff_by_mouse_temp1 <- specimen_avg_data_df %>%
  filter(!(Phase %in% c("training", "vehicle"))) %>%
  mutate(Date = as.factor(Date))

vehicle_avg_3day <- specimen_avg_data_df %>%
  filter(Phase == "vehicle") %>%
  group_by(`Specimen Name`, group, color, Phase) %>%
  summarize(specimen_mean_systolic = mean(specimen_mean_systolic),
            specimen_sem_systolic  = sd(specimen_mean_systolic)/sqrt(num_mice)) %>%
  mutate(Date = as.factor(cutoff_dates_df %>% filter(Phase == "vehicle") %>% pluck("date_range")))

new_levels <- c(levels(vehicle_avg_3day$Date[1]), levels(time_series_avg_diff_by_mouse_temp1$Date))

time_series_avg_diff_by_mouse <- bind_rows(vehicle_avg_3day, time_series_avg_diff_by_mouse_temp1) %>%
  mutate(Date = factor(Date, levels = new_levels)) %>%
  group_by(`Specimen Name`) %>%
  mutate(time_series_avg_diff_per_mouse = specimen_mean_systolic - specimen_mean_systolic[1L],
         time_series_sem_diff_per_mouse = specimen_sem - specimen_sem[1L]) %>%
  arrange(Phase) %>%
  dplyr::select(-c(specimen_mean_systolic, 
                   specimen_sem_systolic, specimen_median_systolic, 
                   specimen_sd_systolic, specimen_sem))


ggplot(time_series_avg_diff_by_mouse,
       aes(x = `Date`,
           y = time_series_avg_diff_per_mouse, group = `Specimen Name`, shape = `Specimen Name`, color = `group`)) +
  geom_point(show.legend = T) +
  geom_line()+
  # geom_line(aes(group = group, color = group), stat="smooth", method = "lm",
  #           size = 1.5,
  #           linetype ="dashed",
  #           alpha = 0.7) +
  # scale_color_jco() +
  my_theme +
  geom_errorbar(aes(ymin = time_series_avg_diff_per_mouse - time_series_sem_diff_per_mouse,
                    ymax = time_series_avg_diff_per_mouse + time_series_sem_diff_per_mouse), width=.2,
                position=position_dodge(0.05))+
  scale_y_continuous(breaks=seq(-50,150,10)) +
  scale_shape_manual(values = shape_ids) +
  # scale_x_date(breaks=unique(specimen_avg_data_df$Date)) +
  theme(axis.text.x = element_text(angle = 60, hjust=1),
        axis.text.y = element_text(hjust = 0.5)) +
  labs(title = "*Difference* per mouse from vehicle baseline\nTail-cuff Systolic BP measurements") +
  ylab("Systolic blood pressure (mmHg)") +
  scale_color_manual(name = "group", values = levels(specimen_avg_data_df$color)) 


ggplot(time_series_avg_diff_by_mouse %>% filter(Date %in% c(levels(vehicle_avg_3day$Date[1]), cutoff_dates_df$last)),
       aes(x = `Date`,
           y = time_series_avg_diff_per_mouse, group = `Specimen Name`, shape = `Specimen Name`, color = `group`)) +
  geom_point(show.legend = T) +
  geom_line()+
  # geom_line(aes(group = group, color = group), stat="smooth", method = "lm",
  #           size = 1.5,
  #           linetype ="dashed",
  #           alpha = 0.7) +
  # scale_color_jco() +
  my_theme +
  geom_errorbar(aes(ymin = time_series_avg_diff_per_mouse - time_series_sem_diff_per_mouse,
                    ymax = time_series_avg_diff_per_mouse + time_series_sem_diff_per_mouse), width=.2,
                position=position_dodge(0.05))+
  scale_y_continuous(breaks=seq(-50,150,10)) +
  scale_shape_manual(values = shape_ids) +
  # scale_x_date(breaks=unique(specimen_avg_data_df$Date)) +
  theme(axis.text.x = element_text(angle = 60, hjust=1),
        axis.text.y = element_text(hjust = 0.5)) +
  labs(title = "*Difference* per mouse from vehicle baseline\nTail-cuff Systolic BP measurements") +
  ylab("Systolic blood pressure (mmHg)") +
  scale_color_manual(name = "group", values = levels(specimen_avg_data_df$color)) 




time_series_avg_diff <- time_series_avg_diff_by_mouse %>%
  group_by(group, color, Date) %>%
  summarize(time_series_avg_diff = mean(time_series_avg_diff_per_mouse),
            time_series_sem_diff = sd(time_series_sem_diff_per_mouse)/sqrt(num_mice))

ggplot(time_series_avg_diff,
       aes(x = `Date`,
           y = time_series_avg_diff, group = `group`, color = `group`)) +
  geom_point(show.legend = T) +
  geom_line()+
  # geom_line(aes(group = group, color = group), stat="smooth", method = "lm",
  #           size = 1.5,
  #           linetype ="dashed",
  #           alpha = 0.7) +
  # scale_color_jco() +
  my_theme +
  geom_errorbar(aes(ymin = time_series_avg_diff - time_series_sem_diff,
                    ymax = time_series_avg_diff + time_series_sem_diff), width=.2,
                position=position_dodge(0.05))+
  scale_y_continuous(breaks=seq(-50,150,10)) +
  theme(axis.text.x = element_text(angle = 60, hjust=1),
        axis.text.y = element_text(hjust = 0.5)) +
  labs(title = "*Difference* averaged over groups from vehicle baseline\nTail-cuff Systolic BP measurements") +
  ylab("Systolic blood pressure (mmHg)") +
  scale_color_manual(name = "group", values = levels(specimen_avg_data_df$color)) 



time_series_avg_diff <- time_series_avg_diff_by_mouse %>%
  group_by(group, color, Date) %>%
  summarize(time_series_avg_diff = mean(time_series_avg_diff_per_mouse),
            time_series_sem_diff = sd(time_series_sem_diff_per_mouse)/sqrt(num_mice))

ggplot(time_series_avg_diff %>% filter(Date %in% c(levels(vehicle_avg_3day$Date[1]), cutoff_dates_df$last)),
       aes(x = `Date`,
           y = time_series_avg_diff, group = `group`, color = `group`)) +
  geom_point(show.legend = T) +
  geom_line()+
  # geom_line(aes(group = group, color = group), stat="smooth", method = "lm",
  #           size = 1.5,
  #           linetype ="dashed",
  #           alpha = 0.7) +
  # scale_color_jco() +
  my_theme +
  geom_errorbar(aes(ymin = time_series_avg_diff - time_series_sem_diff,
                    ymax = time_series_avg_diff + time_series_sem_diff), width=.2,
                position=position_dodge(0.05))+
  scale_y_continuous(breaks=seq(-50,150,10)) +
  theme(axis.text.x = element_text(angle = 60, hjust=1),
        axis.text.y = element_text(hjust = 0.5)) +
  labs(title = "*Difference* averaged over groups from vehicle baseline\nTail-cuff Systolic BP measurements") +
  ylab("Systolic blood pressure (mmHg)") +
  scale_color_manual(name = "group", values = levels(specimen_avg_data_df$color)) 
```

# Time-series diff by group

<!-- ### Treatment with Sunitinib start on 03/15/21 BP (recording of effect starts on 03/16/21) -->
  <!-- Since BP is recorded before gavage to reduce stress -->
These are the average blood pressures of each mice across each Phase of the experiment

```{r treatment, echo = FALSE, message=FALSE, warning=FALSE,fig.width=18}

# cutoff dates calculated early on
names_chr <- cutoff_dates_df %>% pluck("Phase")
ranges_chr <- cutoff_dates_df %>% pluck("date_range")
formatted_ranges <- str_c(names_chr, ranges_chr, sep = ": ", collapse = "\n")

# see lots of dots because outliers are removed per DAY not as a function of all of the data across days
# should this change?
ggplot(final_filtered_data, aes(x = `Specimen Name`, y = `Systolic`, group = `Specimen Name`, fill = `group`)) +
  geom_boxplot(position = "dodge", alpha=0.8) +
  stat_summary(fun.y="mean", color = "cyan", show.legend = F) +
  scale_y_continuous(breaks=seq(70,250,10)) +
  # scale_fill_jco() +
  scale_fill_manual(name = "group", values = levels(final_filtered_data$color)) +
  labs(title = "Boxplots of Mice Systolic BP by Tail-cuff by Phase",
       subtitle = formatted_ranges) +
  ylab("Systolic blood pressure (mmHg)") +
  my_theme +
  theme(axis.title.y = element_text(hjust = 0.5)) +
  geom_label(data = group_wise_bp_and_sem,
             aes(label = round(specimen_grp_phase_mean_systolic,1),
                 x = `Specimen Name`, hjust = 0.5, y = 200),
             inherit.aes = FALSE) +
  facet_grid(cols = vars(Phase))

```

# Booklet usage instructions
## Download excel data after finishing the experiments onto thumb drive
Copy the data into a *master* excel file with two sheets, making sure that there's only one header (at the very top of the page).

  1. You'll need to add in two columns manually to this master sheet: *Date* and *Phase*. Phase can take one of 4 values: "training", "baseline", "vehicle", "treatment". Training data ultimately gets removed, but included in the data sheet for completeness. First sheet should look like this:
```{r data_shape, echo = FALSE, message=FALSE, warning=FALSE}

header_to_edit <- colnames(data_temp)[6]
kable(head(data_temp, n = 6), caption = "Metadata", align = "c") %>%
  column_spec(column = c(ncol(data_temp)-1, ncol(data_temp)), background = "greenyellow") %>%
  kable_styling("striped")

```
2. Second sheet should be created manually based on your mice. Fill in the various fields.
```{r data_shape2, echo = FALSE, message=FALSE, warning=FALSE}
kable(head(meta_df_all %>% arrange(`Date`, `Specimen Name`), n = 10), caption = "Metadata", align = "c") %>%
  kable_styling("striped")
```



